feat: Add support for `nw.int_range` for eager backends #2895

FBruzzesi · 2025-07-26T20:38:23Z

What type of PR is this? (check all applicable)

Checklist

Code follows style guide (ruff)
Tests added
Documented the changes

If you have comments or can explain your changes, please do so below

Related issue: #2722

I am slightly concerned that this could turn out to be a month long PR. I will open it as draft for now. There are a couple of pain points I already know:

~~I was very careful with type hints, yet it somehow doesn't pass type checker~~ Fixed
A custom pre-commit check (on importing from dtypes) will fail, because Int64 is used as the default for the dtype argument. In principle we could just ignore the argument completely and always opt for Int64.
Polars can do eager: bool. In our case it's a bit more complex than that. Instead of adding yet another argument to the function, I allowed for eager to be False (default), None or the backend/implementation that should back the series.
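To make the `eager` idea above concrete, here is a minimal pure-Python sketch. Everything in it is hypothetical (`int_range_sketch`, `EagerBackend`, the dict standing in for an expression); it only illustrates how one argument can both choose lazy vs. eager and name the backend, and is not the narwhals implementation.

```python
from typing import Dict, List, Literal, Optional, Union

# Hypothetical stand-in for the set of eager backends.
EagerBackend = Literal["pandas", "polars", "pyarrow"]


def int_range_sketch(
    start: int,
    end: Optional[int] = None,
    step: int = 1,
    *,
    eager: Union[EagerBackend, Literal[False]] = False,
) -> Union[Dict[str, object], List[int]]:
    """Return a lazy description when `eager` is False, else materialized values."""
    if end is None:
        # A single bound means "from 0 up to that bound", as in `range(n)`.
        start, end = 0, start
    if eager is False:
        # Lazy path: nothing is computed, we only record what to compute.
        return {"kind": "expr", "start": start, "end": end, "step": step}
    # Eager path: `eager` names the backend that would own the result.
    # This sketch ignores the name and always returns a plain list.
    return list(range(start, end, step))
```

With this shape, `int_range_sketch(3)` stays lazy, while `int_range_sketch(3, eager="polars")` materializes the values.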

narwhals/_arrow/namespace.py

tests/expr_and_series/int_range_test.py

tests/v1_test.py

dangotbanned · 2025-07-27T12:31:16Z

Hey @FBruzzesi, as I've mentioned 8347839 times, I'm very keen to see this in narwhals! 😄

Would you be okay if we let this one marinate for a bit and maybe target the release after the next? 🙏

narwhals/_arrow/series.py

FBruzzesi · 2025-07-27T13:07:05Z

> Hey @FBruzzesi, as I've mentioned 8347839 times, I'm very keen to see this in narwhals! 😄

🎉

> Would you be okay if we let this one marinate for a bit and maybe target the release after the next? 🙏

Yeah, I was not expecting this to be finalized by tomorrow. It has some sharp edges that need some love and work.

FBruzzesi · 2025-07-28T14:29:34Z

@MarcoGorelli any "trick" to avoid the custom check in precommit?

For compliant namespaces the import of IntegerType is in the if TYPE_CHECKING block
For function.py, Int64 is used as the default value (and then overwritten by the v*.dtypes.Int64 dtype). I guess one way of doing it is to let it be None and set it inside the function given the version. Yet it's a bit less explicit to a user.

I thought about emulating what utils/import_check.py does, but I would need to figure out what's happening there from scratch 😂
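The "let it be None and set it inside the function" option could look roughly like the sketch below. The two placeholder classes and the `version` parameter are assumptions made for illustration; in the real code the version would come from the namespace the function lives in.

```python
from typing import Optional, Tuple


# Hypothetical per-version dtype classes; the names are illustrative only.
class V1Int64: ...


class MainInt64: ...


DEFAULT_INT_DTYPE = {"v1": V1Int64, "main": MainInt64}


def int_range_sketch(
    start: int,
    end: int,
    *,
    dtype: Optional[type] = None,
    version: str = "main",
) -> Tuple[range, type]:
    # `None` means "use the Int64 of whichever API version is active",
    # so no dtype has to be imported just to serve as a default value.
    resolved = DEFAULT_INT_DTYPE[version] if dtype is None else dtype
    return range(start, end), resolved
```

The trade-off is the one noted above: the signature shows `None` rather than `Int64`, so the effective default has to be documented.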

dangotbanned · 2025-07-28T14:41:10Z

> any "trick" to avoid the custom check in precommit?

You could add this to nw.typing, and then use that everywhere instead?

IntegerDType: TypeAlias = "dtypes.IntegerType | type[dtypes.IntegerType]"

.pre-commit-config.yaml

narwhals/functions.py

#2895 (comment)

#2982 removed from everywhere else

> TypeError: argument 'step': 'PolarsExpr' object cannot be interpreted as an integer https://github.com/narwhals-dev/narwhals/actions/runs/17499840563/job/49709898155?pr=2895

related #3096 (comment)

dangotbanned · 2025-09-13T09:53:57Z

this is probably fine, i'd just like to take a look before merging please, thanks 🙏

@MarcoGorelli pinging again as (#2895 (comment)) was a month ago 🫣

dangotbanned · 2025-11-04T13:41:39Z

@FBruzzesi I still have hope that eventually we'll land this 🙏

I have an idea of how we could reuse int_range to implement date_range for pyarrow.

Idea

As an example, if we convert the output of pl.date_range to pyarrow, we can see the pa.DataType as:

import datetime as dt

import polars as pl
import pyarrow as pa

start, end = dt.date(2000, 1, 1), dt.date(2000, 1, 5)
expr = pl.date_range(start, end).alias("date")
table = pl.select(expr).to_arrow()
table.column("date").type

See date32

DataType(date32[day])

Since that is just represented as a 32-bit integer, we can do things like:

dates = table.column("date")
dates

Show Output

<pyarrow.lib.ChunkedArray object at 0x00000296A456BDC0>
[
  [
    2000-01-01,
    2000-01-02,
    2000-01-03,
    2000-01-04,
    2000-01-05
  ]
]

dates.cast(pa.int32())

Show Output

<pyarrow.lib.ChunkedArray object at 0x00000296A456B8E0>
[
  [
    10957,
    10958,
    10959,
    10960,
    10961
  ]
]

dates.cast(pa.int32()).cast(pa.date32())

Show Output

<pyarrow.lib.ChunkedArray object at 0x00000296A456BAC0>
[
  [
    2000-01-01,
    2000-01-02,
    2000-01-03,
    2000-01-04,
    2000-01-05
  ]
]
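The integers in that round trip are day counts since the Unix epoch (1970-01-01), which is how `date32` stores dates. A standard-library-only sketch of the same conversion, with `to_days` and `from_days` as hypothetical helper names:

```python
import datetime as dt

EPOCH = dt.date(1970, 1, 1)


def to_days(d: dt.date) -> int:
    """Days elapsed since the Unix epoch (the date32 representation)."""
    return (d - EPOCH).days


def from_days(n: int) -> dt.date:
    """Inverse of `to_days`."""
    return EPOCH + dt.timedelta(days=n)


print(to_days(dt.date(2000, 1, 1)))  # 10957
print(from_days(10961))  # 2000-01-05
```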

In practice

import datetime as dt
from typing import Literal

import pyarrow as pa

# local stand-in, assumed to match the alias used in narwhals
ClosedInterval = Literal["left", "right", "none", "both"]


def date_range(
    start: dt.date,
    end: dt.date,
    interval: int,  # (* assuming the `Interval` part is solved)
    *,
    closed: ClosedInterval = "both",
) -> pa.Date32Array:
    start_i = pa.scalar(start).cast(pa.int32()).as_py()
    end_i = pa.scalar(end).cast(pa.int32()).as_py()

    # call `int_range` here for the compatibility branch
    arr = pa.arange(start_i, end_i + 1, interval)
    # `start` is always the first element, so drop it when the left side is open
    if closed in {"right", "none"} and len(arr):
        arr = arr.slice(1)
    # `end` is only present when the step lands on it exactly
    if closed in {"left", "none"} and len(arr) and arr[-1].as_py() == end_i:
        arr = arr.slice(0, len(arr) - 1)

    # the first cast would happen in `int_range(dtype=...)`
    return arr.cast(pa.int32()).cast(pa.date32())


start, end = dt.date(2000, 1, 1), dt.date(2000, 2, 1)

date_range(start, end, interval=7, closed="none")
<pyarrow.lib.Date32Array object at 0x00000296A74176A0>
[
  2000-01-08,
  2000-01-15,
  2000-01-22,
  2000-01-29
]
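For comparison, here is a standard-library-only sketch of the same "integers in, dates out" idea, with `range` standing in for `int_range`. The name `date_range_py` and the choice to drop an endpoint only when it is both excluded by `closed` and actually present in the stepped range are assumptions of this sketch.

```python
import datetime as dt
from typing import List

EPOCH = dt.date(1970, 1, 1)


def date_range_py(
    start: dt.date, end: dt.date, interval: int, *, closed: str = "both"
) -> List[dt.date]:
    start_i = (start - EPOCH).days
    end_i = (end - EPOCH).days
    # Integer range over day counts; `end_i + 1` makes the stop inclusive.
    days = list(range(start_i, end_i + 1, interval))
    # `start` is always the first element, so drop it if the left side is open.
    if closed in ("right", "none") and days:
        days = days[1:]
    # `end` is only present when the step lands on it exactly.
    if closed in ("left", "none") and days and days[-1] == end_i:
        days = days[:-1]
    return [EPOCH + dt.timedelta(days=n) for n in days]


result = date_range_py(dt.date(2000, 1, 1), dt.date(2000, 2, 1), 7, closed="none")
print([d.isoformat() for d in result])
# ['2000-01-08', '2000-01-15', '2000-01-22', '2000-01-29']
```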

What's the catch?

We'd need to adapt Interval to support a different kind of parse.

Show Interval

narwhals/narwhals/_duration.py

Lines 49 to 94 in 01aab21

    
class Interval:
    def __init__(self, multiple: int, unit: IntervalUnit, /) -> None:
        self.multiple: int = multiple
        self.unit: IntervalUnit = unit

    def to_timedelta(
        self, *, unsupported: Container[IntervalUnit] = frozenset(("ns", "mo", "q", "y"))
    ) -> dt.timedelta:
        if self.unit in unsupported:  # pragma: no cover
            msg = f"Creating timedelta with {self.unit} unit is not supported."
            raise NotImplementedError(msg)
        kwd = UNIT_TO_TIMEDELTA[self.unit]
        # error: Keywords must be strings (bad mypy)
        return dt.timedelta(**{kwd: self.multiple})  # type: ignore[misc]

    @classmethod
    def parse(cls, every: str) -> Interval:
        multiple, unit = cls._parse(every)
        if unit == "mo" and multiple not in MONTH_MULTIPLES:
            msg = f"Only the following multiples are supported for 'mo' unit: {MONTH_MULTIPLES}.\nGot: {multiple}."
            raise ValueError(msg)
        if unit == "q" and multiple not in QUARTER_MULTIPLES:
            msg = f"Only the following multiples are supported for 'q' unit: {QUARTER_MULTIPLES}.\nGot: {multiple}."
            raise ValueError(msg)
        if unit == "y" and multiple != 1:
            msg = (
                f"Only multiple 1 is currently supported for 'y' unit.\nGot: {multiple}."
            )
            raise ValueError(msg)
        return cls(multiple, unit)

    @classmethod
    def parse_no_constraints(cls, every: str) -> Interval:
        return cls(*cls._parse(every))

    @staticmethod
    def _parse(every: str) -> tuple[int, IntervalUnit]:
        if match := PATTERN_INTERVAL.match(every):
            multiple = int(match["multiple"])
            unit = cast("IntervalUnit", match["unit"])
            return multiple, unit
        msg = (
            f"Invalid `every` string: {every}. Expected string of kind <number><unit>, "
            f"where 'unit' is one of: {get_args(IntervalUnit)}."
        )
        raise ValueError(msg)

It would be restricted to only d, w, mo, q, y - but even within that - maybe more flexible given that we know +1d is equivalent to +1 🤔?
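One possible shape for that "different kind of parse" is sketched below, limited to the units with a fixed day count (`d` and `w`). The pattern, the lookup table, and the name `interval_to_days` are all hypothetical. `mo`, `q` and `y` are left out on purpose, because they do not correspond to a constant number of days and would need calendar-aware stepping rather than an integer step.

```python
import re

# Hypothetical pattern: a positive integer followed by a day-aligned unit.
PATTERN_DAYS = re.compile(r"^(?P<multiple>\d+)(?P<unit>d|w)$")
DAYS_PER_UNIT = {"d": 1, "w": 7}


def interval_to_days(every: str) -> int:
    """Turn strings like '1d' or '2w' into an integer step usable by an int range."""
    match = PATTERN_DAYS.match(every)
    if match is None:
        msg = f"Expected <number><unit> with unit in {sorted(DAYS_PER_UNIT)}, got: {every!r}."
        raise ValueError(msg)
    return int(match["multiple"]) * DAYS_PER_UNIT[match["unit"]]


print(interval_to_days("1d"))  # 1
print(interval_to_days("2w"))  # 14
```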

dangotbanned · 2025-11-04T17:13:21Z

narwhals/functions.py

+@unstable
+def int_range(
+    start: int | Expr,
+    end: int | Expr | None = None,
+    step: int = 1,
+    *,
+    dtype: IntegerDType = Int64,
+    eager: IntoBackend[EagerAllowed] | Literal[False] = False,
+) -> Expr | Series[Any]:


Note
I don't think this blocks anything - just a realization from me 🙂

I hadn't been able to put my finger on what seemed off to me wrt this until just now:

> Polars can do eager: bool. In our case it's a bit more complex than that.
> Instead of adding yet another argument to the function, I allowed for eager to be False (default), None or the backend/implementation that should back the series.

This trick would work for us in all the cases where eager defines Expr | Series.
That is most of them, but there is polars.select as the ugly duckling with:

DataFrame | LazyFrame

@overload
def select(
    *exprs: IntoExpr | Iterable[IntoExpr],
    eager: Literal[True] = ...,
    **named_exprs: IntoExpr,
) -> DataFrame: ...
@overload
def select(
    *exprs: IntoExpr | Iterable[IntoExpr],
    eager: Literal[False],
    **named_exprs: IntoExpr,
) -> LazyFrame: ...
def select(
    *exprs: IntoExpr | Iterable[IntoExpr], eager: bool = True, **named_exprs: IntoExpr
) -> DataFrame | LazyFrame:

If we added select, then we'd need two arguments to be able to say whether we want pl.DataFrame or pl.LazyFrame

def select(
    *exprs: IntoExpr | Iterable[IntoExpr],
    backend: IntoBackend[Backend],
    eager: bool = True,
    **named_exprs: IntoExpr,
) -> DataFrame | LazyFrame: ...

But that also seems a bit footgun-y, since something like select(..., backend="duckdb") would have a conflicting default.
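The conflict can be shown with a small stand-alone sketch. `select_sketch` and `LAZY_ONLY` are hypothetical names, and returning a string instead of a frame is purely for illustration; raising an error is just one way such an API could react.

```python
from typing import Literal

# Illustrative only: backends that cannot produce an eager frame.
LAZY_ONLY = frozenset({"duckdb"})


def select_sketch(
    *, backend: str, eager: bool = True
) -> Literal["DataFrame", "LazyFrame"]:
    if eager and backend in LAZY_ONLY:
        # The default `eager=True` contradicts a lazy-only backend, so the
        # caller must remember to pass `eager=False` explicitly.
        msg = f"backend={backend!r} is lazy-only; pass eager=False."
        raise ValueError(msg)
    return "DataFrame" if eager else "LazyFrame"
```

Here `select_sketch(backend="duckdb")` fails unless the caller also passes `eager=False`, which is the footgun described above.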

FBruzzesi added 5 commits July 26, 2025 19:00

Eager mode

cd8b49d

lazy WIP

74b94c1

merge main

81f12fa

fixed eager

c775ccd

add docs, cleanse a bit

b3ba810

FBruzzesi added the enhancement, pyarrow, pandas-like, and polars labels Jul 26, 2025

FBruzzesi commented Jul 26, 2025

narwhals/_arrow/namespace.py (outdated)

FBruzzesi added 2 commits July 26, 2025 23:02

fix or ignore typing issues

9ee6209

skip if impl not installed

a3496ed

FBruzzesi commented Jul 26, 2025

tests/expr_and_series/int_range_test.py (outdated)

overloads?

dc263c8

FBruzzesi commented Jul 26, 2025

tests/v1_test.py

fix overloads

f612646

FBruzzesi marked this pull request as ready for review July 27, 2025 09:17

dangotbanned reviewed Jul 27, 2025

narwhals/_arrow/series.py (outdated)

dangotbanned mentioned this pull request Jul 28, 2025

[Enh]: Support int_range #2722

Open

FBruzzesi added 2 commits July 28, 2025 16:17

merge main and add to v2

5a46579

add in v2.__all__

b8b6ae2

FBruzzesi added 3 commits July 28, 2025 16:53

factor out _native_int_range into utils

22c52eb

resolve majority of typing and import issues

8f4f647

replace all type hints with IntegerDType, ignore import in functions

7fb18bb

FBruzzesi commented Jul 28, 2025

.pre-commit-config.yaml (outdated)

Dan's suggestion

dde4a99

FBruzzesi and others added 9 commits August 6, 2025 11:11

Merge branch 'main' into feat/int-range

3b96967

Merge branch 'main' into feat/int-range

c98740b

Merge branch 'main' into feat/int-range

a7d4e25

Merge remote-tracking branch 'upstream/main' into feat/int-range

190590b

Merge branch 'main' into feat/int-range

cf5799a

Merge branch 'main' into feat/int-range

f7e5c9a

Merge remote-tracking branch 'upstream/main' into feat/int-range

5c7e6e6

Merge branch 'main' into feat/int-range

5484863

Merge branch 'main' into feat/int-range

ec5ee4b

dangotbanned mentioned this pull request Aug 16, 2025

[Enh]: Support for top-level function pl.repeat() #3000

Open

dangotbanned added 2 commits August 17, 2025 17:01

Merge remote-tracking branch 'upstream/main' into feat/int-range

813c101

Merge remote-tracking branch 'upstream/main' into feat/int-range

9584b0f

dangotbanned reviewed Aug 19, 2025

narwhals/functions.py (outdated)

Merge branch 'main' into feat/int-range

7e81d46

dangotbanned marked this pull request as draft August 20, 2025 17:18

dangotbanned added 2 commits August 20, 2025 17:18

refactor(typing): Use IntoBackend[EagerAllowed]

0c6495f

#2895 (comment)

docs: Remove Returns sections

45cf54a

#2982 removed from everywhere else

dangotbanned marked this pull request as ready for review August 20, 2025 17:31

FBruzzesi and others added 8 commits August 24, 2025 11:32

merge main

abe3da7

Merge branch 'main' into feat/int-range

fb4cfda

Merge branch 'main' into feat/int-range

e8739e5

Merge remote-tracking branch 'upstream/main' into feat/int-range

1d8602b

fix: Update for (#3045)

fb6c328

fix: Don't treat step as an Expr

35ece71

> TypeError: argument 'step': 'PolarsExpr' object cannot be interpreted as an integer https://github.com/narwhals-dev/narwhals/actions/runs/17499840563/job/49709898155?pr=2895

Merge remote-tracking branch 'upstream/main' into feat/int-range

afd35f6

chore(typing): fix incompatible override

0de173e

related #3096 (comment)

Merge remote-tracking branch 'upstream/main' into feat/int-range

80c422a

dangotbanned reviewed Nov 4, 2025
